UNICEF Data Story: Global Child Vulnerability & DTP Vaccination
Spring 2025 BAA1030 Data Analytics & Story Telling (20074)
Student Name: Isha Tanwar
Student ID: 48614
Programme: MSc in Management (Strategy)
Thanks to Professor Dr. Damien Dupre, Dublin City University, for his unwavering guidance and support.
Code
import pandas as pd
from plotnine import *
from sklearn.preprocessing import MinMaxScaler
import plotly.graph_objects as go

# Load dataset
merged_df = pd.read_csv("merged_final.csv")

# Derived column: orphans per 1,000 total population
merged_df["OrphanRate_per_1000"] = (
    merged_df["OrphanCount"] / merged_df["Population, total"]
) * 1000

# Rename for convenience
merged_df = merged_df.rename(columns={
    "GDP per capita (constant 2015 US$)": "GDP",
    "Life expectancy at birth, total (years)": "LifeExpectancy",
})
Executive Summary
This report explores global disparities in child vulnerability using data from UNICEF and the World Bank.
It combines four indicators (GDP per capita, DTP immunization coverage, orphan rate per 1,000 population, and life expectancy) into a composite Child Vulnerability Index (CVI).
Visual analytics highlight countries most at risk, helping align interventions with SDG 3 (Health) and SDG 10 (Inequality).
Introduction
This report provides a data-driven exploration of global child vulnerability, focusing on DTP (Diphtheria, Tetanus, and Pertussis) vaccination coverage. Using datasets from UNICEF and the World Bank, this analysis combines economic, health, and demographic indicators to expose inequities in child well-being.
Insight: Countries with higher GDP per capita tend to achieve higher DTP vaccination rates. This association suggests that economic capacity plays an important role in health system performance and immunization reach.
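One way to check the GDP and DTP association is a rank correlation, which is less sensitive to the strong right skew of GDP per capita than a plain Pearson coefficient. The sketch below is illustrative only: the helper name `gdp_dtp_rank_correlation` and the toy values are assumptions, and it expects columns named `GDP` and `DTP` as in the renamed dataset.

```python
import pandas as pd


def gdp_dtp_rank_correlation(df: pd.DataFrame) -> float:
    """Spearman-style rank correlation between GDP and DTP coverage.

    Computed as the Pearson correlation of the ranks, so it needs only pandas.
    """
    pairs = df[["GDP", "DTP"]].dropna()
    return pairs["GDP"].rank().corr(pairs["DTP"].rank())


# Toy values for illustration only (not taken from merged_final.csv)
toy = pd.DataFrame({
    "GDP": [500, 1200, 4000, 15000, 42000],
    "DTP": [55, 70, 68, 90, 95],
})
rho = gdp_dtp_rank_correlation(toy)
print(round(rho, 2))  # 0.9 for these toy values
```

On the real data, the same call would be `gdp_dtp_rank_correlation(merged_df)`. A value near 1 indicates that richer countries tend to rank higher on coverage; it does not by itself show that GDP drives coverage.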
Bar Chart: Top 10 Countries by Orphanhood
Code
import pandas as pd
from plotnine import *

# Get the most recent year in your dataset
latest_year = merged_df["Year"].max()

# Filter for top 10 countries by OrphanCount
bar_data = (
    merged_df[merged_df["Year"] == latest_year]
    .dropna(subset=["OrphanCount"])
    .sort_values(by="OrphanCount", ascending=False)
    .head(10)
)

# Format OrphanCount as readable labels like "13.6M"
def format_millions(val):
    return f"{val / 1_000_000:.1f}M"

bar_data["Label"] = bar_data["OrphanCount"].apply(format_millions)

# Plot with nice labels
bar_chart = (
    ggplot(bar_data, aes(x="reorder(Country, OrphanCount)", y="OrphanCount"))
    + geom_bar(stat="identity", fill="#f15b42", width=0.7)
    + geom_text(
        aes(label="Label"),
        format_string="{:}",
        nudge_y=2e5,  # small offset to move label slightly right
        size=11,
        color="black",
        ha="left",
    )
    + coord_flip()
    + labs(
        title=f"Top 10 Countries by Orphanhood ({latest_year})",
        x="Country",
        y="Number of Orphaned Children",
    )
    + theme_minimal()
    + theme(
        figure_size=(14, 8),
        axis_text=element_text(size=14),
        plot_title=element_text(size=18, weight="bold"),
        axis_title=element_text(size=15),
    )
)
bar_chart.draw()
Insight: Countries like Nigeria, DR Congo, and Pakistan top the orphanhood chart, reflecting the heavy burden of conflict, disease, and poverty.
This visualization underscores how systemic crises directly affect children’s lives, and it makes the case for accelerating child protection and family-strengthening programs in high-risk regions.
Time Series: Orphanhood Trends (Multi-Country Normalized)
Insight: While absolute orphan counts vary, the normalized trends show that countries like Somalia and Yemen consistently exhibit high orphanhood rates per 1,000 population. This reveals not just the scale of vulnerability but also its intensity, adjusted for population size.
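The normalized series can be prepared with a small helper. This is a minimal sketch, not the report's original code: the function name `build_ts_data`, the country selection, and the toy rows are assumptions, while the column names follow the dataset loaded earlier.

```python
import pandas as pd


def build_ts_data(df: pd.DataFrame, countries: list[str]) -> pd.DataFrame:
    """Return a tidy Country/Year table of orphans per 1,000 population."""
    out = df[df["Country"].isin(countries)].copy()
    out["OrphanRate_per_1000"] = (
        out["OrphanCount"] / out["Population, total"] * 1000
    )
    return (
        out.dropna(subset=["OrphanRate_per_1000"])
        .sort_values(["Country", "Year"])
        [["Country", "Year", "OrphanRate_per_1000"]]
        .reset_index(drop=True)
    )


# Toy rows for illustration only (not real figures)
toy = pd.DataFrame({
    "Country": ["A", "A", "B", "B", "C"],
    "Year": [2020, 2021, 2020, 2021, 2021],
    "OrphanCount": [1000, 1500, 200, None, 50],
    "Population, total": [100_000, 100_000, 50_000, 50_000, 10_000],
})
ts_data = build_ts_data(toy, countries=["A", "B"])
print(ts_data)
```

Rows with a missing orphan count are dropped rather than filled, so gaps in a country's series stay visible as gaps in the line chart.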
Faceted View: Orphanhood Trends by Country
Code
facet_chart = (
    ggplot(ts_data, aes(x="Year", y="OrphanRate_per_1000"))
    + geom_line(color="#c778a2", size=1.2)
    + geom_point(color="#af4e88", size=2)
    + facet_wrap("~Country")
    + labs(
        title="Orphanhood Trends by Country (Faceted)",
        x="Year",
        y="Orphan Rate per 1000",
    )
    + theme_minimal()
)
facet_chart.draw()
Insight: Country-specific plots make it easier to spot individual trends, for example a steady rise in Ukraine versus fluctuations in Nigeria. This enables policymakers to tailor child protection strategies by context.
World Map: DTP Vaccination Coverage (Rotating Globe)
Insight: The globe highlights stark disparities: Western Europe and North America show high vaccination coverage, while parts of Sub-Saharan Africa and South Asia lag behind. These spatial gaps in access reflect systemic inequities that must be addressed to achieve SDG 3 (Good Health and Well-being).
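A globe view of this kind can be sketched with Plotly's choropleth trace on an orthographic projection, which the reader can drag to rotate. The helper names `latest_dtp_by_country` and `make_globe`, the color scale, and the toy rows are assumptions for illustration. Matching on country names is also an assumption and may miss some spellings; ISO-3 codes are more robust if the dataset has them.

```python
import pandas as pd


def latest_dtp_by_country(df: pd.DataFrame) -> pd.DataFrame:
    """Keep each country's most recent non-missing DTP value."""
    return (
        df.dropna(subset=["DTP"])
        .sort_values("Year")
        .groupby("Country", as_index=False)
        .last()[["Country", "Year", "DTP"]]
    )


def make_globe(frame: pd.DataFrame):
    """Build an orthographic (globe) choropleth; requires plotly."""
    # Imported lazily so the data-preparation step runs without plotly installed
    import plotly.graph_objects as go

    fig = go.Figure(
        go.Choropleth(
            locations=frame["Country"],
            locationmode="country names",
            z=frame["DTP"],
            colorscale="Viridis",
            colorbar=dict(title="DTP coverage (%)"),
        )
    )
    fig.update_geos(projection_type="orthographic", showcoastlines=True)
    fig.update_layout(title_text="DTP Vaccination Coverage")
    return fig


# Toy rows for illustration only (not real coverage figures)
toy = pd.DataFrame({
    "Country": ["Kenya", "Kenya", "Chad", "Chad"],
    "Year": [2019, 2021, 2020, 2021],
    "DTP": [80.0, 90.0, 50.0, None],
})
globe_frame = latest_dtp_by_country(toy)
print(globe_frame)
# fig = make_globe(globe_frame); fig.show()
```

Using each country's latest available value, rather than a single fixed year, keeps countries on the map when their most recent report is missing.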
Code
indicators = merged_df[["DTP", "GDP", "LifeExpectancy", "OrphanRate_per_1000"]].dropna()

# Invert values so high = more vulnerable
# (OrphanRate_per_1000 is already "higher = worse", so it is left as-is)
inverted = indicators.copy()
inverted["DTP"] = 100 - inverted["DTP"]
inverted["GDP"] = inverted["GDP"].max() - inverted["GDP"]
inverted["LifeExpectancy"] = inverted["LifeExpectancy"].max() - inverted["LifeExpectancy"]

# Normalize each indicator to 0-100
scaler = MinMaxScaler(feature_range=(0, 100))
normalized = scaler.fit_transform(inverted)

# Compute CVI as the equal-weight mean and reattach it to the dataframe
merged_df.loc[indicators.index, "Child Vulnerability Index"] = normalized.mean(axis=1)

# Create CVI data subset for plotting
cvi_data = merged_df.dropna(subset=["Child Vulnerability Index"]).copy()
cvi_data = cvi_data.rename(columns={"Child Vulnerability Index": "CVI"})
Insight: The CVI shows which countries face the gravest challenges in child well-being. Factors like low GDP, low immunization, and high orphan rates converge to elevate vulnerability. The 10 most vulnerable countries (those with the highest CVI), many of them in Africa, need holistic interventions across health, economic, and social sectors.
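To make the index arithmetic concrete, the sketch below works through the same three steps (invert, min-max scale, average) on three made-up countries using pandas only. The values and the labels X, Y, and Z are hypothetical, and the manual scaling is a stand-in that should match a 0 to 100 min-max scaler column by column.

```python
import pandas as pd

# Toy values for illustration only (not real country data)
toy = pd.DataFrame(
    {
        "DTP": [95.0, 60.0, 40.0],
        "GDP": [40000.0, 2000.0, 500.0],
        "LifeExpectancy": [82.0, 65.0, 55.0],
        "OrphanRate_per_1000": [2.0, 20.0, 50.0],
    },
    index=["X", "Y", "Z"],
)

# Step 1: flip the "higher is better" indicators so higher = more vulnerable
risk = toy.copy()
risk["DTP"] = 100 - risk["DTP"]
for col in ["GDP", "LifeExpectancy"]:
    risk[col] = risk[col].max() - risk[col]

# Step 2: min-max scale each column to the 0-100 range
scaled = (risk - risk.min()) / (risk.max() - risk.min()) * 100

# Step 3: equal-weight average across the four indicators
cvi = scaled.mean(axis=1)
print(cvi.round(1))
```

Country X is best on every indicator and scores 0, while Z is worst on every indicator and scores 100. Country Y lands at about 96 on the GDP component despite earning four times Z's income, because min-max scaling on raw GDP is stretched by the richest country. A log transform of GDP before scaling is one common alternative, though the report does not use it.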
Simulated DTP Coverage (based on GDP)
Code
import pandas as pd
import altair as alt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import MinMaxScaler

# 1. Load data
df = pd.read_csv("merged_final.csv")

# 2. Rename columns
df = df.rename(columns={
    "GDP per capita (constant 2015 US$)": "GDP",
    "Life expectancy at birth, total (years)": "LifeExpectancy",
})

# 3. Filter for years 2014-2024 (copy so later column assignments are safe)
df = df[df["Year"].between(2014, 2024)].copy()

# 4. Create OrphanRate
if "OrphanRate_per_1000" not in df.columns:
    df["OrphanRate_per_1000"] = (df["OrphanCount"] / df["Population, total"]) * 1000

# 5. Compute CVI
cvi_fields = ["DTP", "GDP", "LifeExpectancy", "OrphanRate_per_1000"]
cvi_df = df.dropna(subset=cvi_fields)[["Country"] + cvi_fields].copy()
inv = cvi_df[cvi_fields].copy()
inv["DTP"] = 100 - inv["DTP"]
inv["GDP"] = inv["GDP"].max() - inv["GDP"]
inv["LifeExpectancy"] = inv["LifeExpectancy"].max() - inv["LifeExpectancy"]
scaler = MinMaxScaler((0, 100))
normalized = scaler.fit_transform(inv)
cvi_df["CVI"] = normalized.mean(axis=1)

# 6. Bottom 10 countries by CVI (highest mean CVI = most vulnerable)
bottom_countries = (
    cvi_df.groupby("Country")["CVI"]
    .mean()
    .sort_values(ascending=False)
    .head(10)
    .index.tolist()
)

# 7. Filter latest year for these countries (copy to avoid SettingWithCopyWarning)
latest_year = df["Year"].max()
latest_df = df[
    (df["Year"] == latest_year) & (df["Country"].isin(bottom_countries))
].copy()

# 8. Fit linear regression model on GDP vs DTP
train_df = df.dropna(subset=["GDP", "DTP"])
model = LinearRegression()
model.fit(train_df[["GDP"]], train_df["DTP"])

# 9. Simulate 12% GDP increase
latest_df["GDP_Increased"] = latest_df["GDP"] * 1.12

# 10. Drop NaN before predicting
latest_df = latest_df.dropna(subset=["GDP_Increased"]).copy()

# 11. Predict simulated DTP
gdp_for_prediction = latest_df[["GDP_Increased"]].rename(columns={"GDP_Increased": "GDP"})
latest_df["Simulated_DTP"] = model.predict(gdp_for_prediction)

# 12. Prepare data for Altair
plot_df = latest_df[["Country", "DTP", "Simulated_DTP"]].copy()
plot_df = plot_df.melt(id_vars="Country", var_name="Type", value_name="Coverage")

# 13. Horizontal stacked bar chart
country_order = latest_df.sort_values("DTP")["Country"].tolist()
chart = (
    alt.Chart(plot_df)
    .mark_bar()
    .encode(
        y=alt.Y("Country:N", sort=country_order, title="Country"),
        x=alt.X("Coverage:Q", stack="zero", title="DTP Coverage (%)"),
        color=alt.Color(
            "Type:N",
            scale=alt.Scale(
                domain=["DTP", "Simulated_DTP"],
                range=["#e1a8c1", "#66bb6a"],
            ),
            legend=alt.Legend(title="Coverage Type"),
        ),
        tooltip=["Country", "Type", "Coverage"],
    )
    .properties(
        title={
            "text": "Simulated DTP Coverage Based on GDP (Policy Simulator)",
            "subtitle": "Actual vs Simulated DTP Coverage for Bottom 10 Vulnerable Countries",
            "fontSize": 18,
            "subtitleFontSize": 14,
        },
        width=350,
        height=250,
    )
)
chart
Insight: Countries such as Somalia, Guinea, and Nigeria show large gaps between their actual DTP coverage and the coverage that the linear GDP model predicts after a 12% increase in GDP per capita. Simulated scenarios of this kind can support evidence-driven investment targeting the most vulnerable regions.
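When reading the simulated bars, it helps to separate two things that the difference between simulated and actual coverage combines: the effect of the GDP uplift itself, and how far a country already sits from the regression line. The sketch below illustrates that split with NumPy's least-squares fit. The toy arrays, the `decompose` helper, and the example country values are hypothetical and stand in for the fitted model.

```python
import numpy as np

# Toy training data for illustration only (not from merged_final.csv)
gdp = np.array([500.0, 1000.0, 2000.0, 5000.0, 10000.0])
dtp = np.array([50.0, 60.0, 65.0, 80.0, 90.0])

# Ordinary least squares line: DTP = intercept + slope * GDP
slope, intercept = np.polyfit(gdp, dtp, deg=1)


def decompose(gdp_now: float, dtp_now: float, uplift: float = 0.12):
    """Split (simulated - actual) into a GDP effect and a model residual."""
    fitted_now = intercept + slope * gdp_now
    simulated = intercept + slope * gdp_now * (1 + uplift)
    gdp_effect = simulated - fitted_now   # equals slope * uplift * gdp_now
    residual_gap = fitted_now - dtp_now   # distance of the country from the line today
    return simulated, gdp_effect, residual_gap


# Hypothetical low-income country sitting below the fitted line
simulated, gdp_effect, residual_gap = decompose(gdp_now=800.0, dtp_now=42.0)
print(f"simulated={simulated:.1f}, gdp_effect={gdp_effect:.2f}, residual_gap={residual_gap:.1f}")
```

Under a linear model the GDP effect is proportional to current GDP, so for low-income countries a 12% uplift moves predicted coverage only slightly. In this setup, most of a large simulated gain would come from the residual gap, meaning the country currently under-performs what the model expects at its income level.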
SDG Alignment & Policy Implications
SDG 3: Good Health and Well-being
Investing in immunization helps build resilient health systems.
This analysis provides evidence of how economic and demographic conditions impact child vulnerability. Better policies rely on data-driven insights like these.